A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings
Abstract
A key task in Sentiment Analysis is determining the polarity of the sentiment implied by a given word or expression. Basic Sentiment Analysis systems count and weight these word polarities in different ways to produce a degree of positivity or negativity. Words are now also commonly modelled as continuous dense vectors, known as word embeddings, which appear to encode useful semantic knowledge. In Sentiment Analysis, word embeddings are typically used as features for more complex supervised classification systems to obtain sentiment classifiers. In this paper we compare a set of existing sentiment lexicons and sentiment lexicon generation techniques. We also present a simple but effective technique that calculates a polarity value for each word in a domain using existing methods for generating continuous word embeddings. Finally, we show that word embeddings computed on an in-domain corpus capture polarity better than those computed on a general-domain corpus.
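The abstract does not spell out how the polarity value is computed, so the following is only a minimal sketch of one plausible approach, not the authors' method: score a word by its cosine similarity to a positive seed word minus its similarity to a negative seed word. The seed words ("excellent", "poor"), the function names, and the toy three-dimensional vectors are all assumptions made for illustration; real embeddings would be trained on an in-domain corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def polarity(word, embeddings, pos_seed="excellent", neg_seed="poor"):
    """Hypothetical polarity score in [-2, 2]: similarity to the positive
    seed minus similarity to the negative seed. The seeds are assumed,
    not taken from the paper."""
    w = embeddings[word]
    return cosine(w, embeddings[pos_seed]) - cosine(w, embeddings[neg_seed])

# Toy vectors invented for this example; they stand in for embeddings
# learned from an in-domain corpus (e.g. restaurant reviews).
toy_embeddings = {
    "excellent": [0.9, 0.1, 0.0],
    "poor": [-0.8, 0.2, 0.1],
    "tasty": [0.7, 0.3, 0.1],
    "bland": [-0.6, 0.1, 0.3],
}

scores = {w: polarity(w, toy_embeddings) for w in ("tasty", "bland")}
for word, score in scores.items():
    print(f"{word}: {score:+.3f}")
```

With these toy vectors, "tasty" receives a positive score and "bland" a negative one. Swapping in embeddings trained on a different corpus changes the scores without changing the code, which is the kind of in-domain versus general-domain comparison the abstract describes.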
Similar resources
funSentiment at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs Using Word Vectors Built from StockTwits and Twitter
This paper describes the approach we used for SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs. We use three types of word embeddings in our algorithm: word embeddings learned from 200 million tweets, sentiment-specific word embeddings learned from 10 million tweets using distance supervision, and word embeddings learned from 20 million StockTwits messages. In our ap...
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings
Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text. The genre of the text typically plays an important role in the effectiveness of the resulting embeddings. How to effectively train word embedding models using data from different domains remains a problem that is underexp...
funSentiment at SemEval-2017 Task 4: Topic-Based Message Sentiment Classification by Exploiting Word Embeddings, Text Features and Target Contexts
This paper describes the approach we used for SemEval-2017 Task 4: Sentiment Analysis in Twitter. Topic-based (target-dependent) sentiment analysis has become attractive and been used in some applications recently, but it is still a challenging research task. In our approach, we take the left and right context of a target into consideration when generating polarity classification features. We u...
Bilexical Embeddings for Quality Estimation
This paper describes the SHEF submissions for the three sub-tasks of the Quality Estimation shared task of WMT17, namely: (i) a word-level prediction system using bilexical embeddings, (ii) a phrase-level labelling approach based on the word-level predictions, (iii) a sentence-level prediction system using word embeddings and handcrafted baseline features. Results are promising for the sentence-...
Convolutional Neural Networks for Sentiment Analysis on Italian Tweets
English. The paper describes our submission to the task 2 of SENTIment POLarity Classification in Italian Tweets at Evalita 2016. Our approach is based on a convolutional neural network that exploits both word embeddings and Sentiment Specific word embeddings. We also experimented a model trained with a distant supervised corpus. Our submission with Sentiment Specific word embeddings achieved t...